We show that neural network classifiers can help discriminate Higgs production events from the huge
background at the LHC, assuming a mass value $M_H \simeq 200$ GeV. We use the high-performance neurochip
TOTEM, trained by the Reactive Tabu Search (RTS) algorithm, which could be used for on-line purposes. Two
different sets of input variables are compared.



1. Introduction 

The Standard Model of elementary particle physics (SM) has so far been highly successful and is
supported by a large body of experimental data. In particular, after the recent discovery of the top
quark, all its elementary building blocks have received solid experimental confirmation, except the
Higgs boson, despite its essential role in the model. As is well known, it provides the mechanism for
breaking the electroweak symmetry, thus generating the masses of the gauge bosons and the fermions.
The importance of the search for this missing element of the Standard Model is shown by the fact that
it is among the main motivations for the activity at future colliders. At LEP2 the Higgs could be
discovered for $M_H < 98$ GeV; for a heavier mass we will have to wait for the Large Hadron Collider
(LHC) at CERN, whose centre-of-mass energy will be $\sqrt{s} = 14$ TeV.

The experimental observation of the Higgs will be a difficult challenge, especially because, as the
SM predicts and detailed studies have confirmed, the signal, i.e. events characterized by the
production of the Higgs boson, will be overwhelmed by background events with multi-hadron production
induced by strong interactions of quarks and gluons. With this work we want to show that an
artificial neural network (ANN) trained with a suitable choice of input variables might be a valid
tool to enhance the signal-to-background ratio. We consider the extraction of Higgs events from
backgrounds in simulated data at LHC energies. In particular we consider two cases: in the first,
called off-line for short, a maximal set of information on each event is available; in the second,
called on-line, only the transverse momenta of the final-state muons are known, as is the case with
the CMS muon spectrometer [11].

*Project supported by Istituto Nazionale di Fisica Nucleare (INFN)

†Author address: Dipartimento di Fisica, Univ. Trento, I-38050 Povo (TN). Tel: +39-461-881530,
fax: +39-461-882014, e-mail: dusini@science.unitn.it

2. TOTEM and RTS 

Neural networks implemented as VLSI hardware are being considered as good candidates for solving
time-critical, high-quality pattern recognition problems in High Energy Physics (HEP). The main
benefit is speed, thanks to the massively parallel architecture. The cost is usually a very complex
architecture, since common algorithms such as back-propagation, being derivative-based, require
high-precision computation. By contrast, the neurochip Totem has a simple structure, as it implements
a "derivative-free" algorithm: the training problem is first transformed into a combinatorial
optimization task and then solved by means of the heuristic method called Reactive Tabu Search (RTS).
RTS builds search trajectories in the space of binary strings of length $L = N \times B$, into which
the $N$ weights needed to configure a neural network are coded using $B$ bits per weight. The search
is intended to locate the best "suboptimal" minimum on a cost surface by means of a sequence of
elementary moves, each consisting of a single bit-flip in the string of weights. When a move is made,
its inverse is forbidden for a prohibition period of $T$ successive steps (Glover's Tabu Search
method), allowing some amount of diversification in the search process. RTS remarkably enhances this
diversification by dynamically adjusting the parameter $T$ through a simple mechanism that evaluates
and reacts to the current local shape of the cost surface. As a result, RTS escapes rapidly from
local minima and cycles, and finds solutions even for low-precision weights, largely independently of
the starting point.
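
The search scheme described above can be illustrated with a minimal Python sketch. It is not the Totem implementation: the reaction rule (lengthen $T$ when a configuration is revisited, shorten it after 50 steps without repetitions), the exhaustive evaluation of all allowed bit-flips, the `decode` mapping of $B$-bit groups to weights in [-1, 1] and the toy quadratic cost are all simplifying assumptions made for illustration.

```python
import random

def reactive_tabu_search(cost, n_bits, steps=2000, seed=0):
    """Minimize cost(bits) over binary strings using single bit-flip moves."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n_bits)]
    best, best_cost = x[:], cost(x)
    T = 1                           # prohibition period, in steps
    last_flip = [-steps] * n_bits   # step at which each bit was last flipped
    visited = set()                 # configurations seen so far
    quiet = 0                       # steps since the last repetition
    for t in range(steps):
        key = tuple(x)
        if key in visited:          # repetition detected: lengthen T
            T = min(n_bits - 2, T + 1 + T // 10)
            quiet = 0
        else:
            visited.add(key)
            quiet += 1
            if quiet > 50:          # no repetitions for a while: shorten T
                T = max(1, T - 1)
                quiet = 0
        # A bit is tabu if it was flipped within the last T steps.
        allowed = [i for i in range(n_bits) if t - last_flip[i] > T]
        best_i, best_c = None, None
        for i in allowed:           # try every allowed flip, keep the best one
            x[i] ^= 1
            c = cost(x)
            x[i] ^= 1
            if best_c is None or c < best_c:
                best_i, best_c = i, c
        x[best_i] ^= 1              # move even if the cost gets worse
        last_flip[best_i] = t
        if best_c < best_cost:
            best, best_cost = x[:], best_c
    return best, best_cost

def decode(bits, n_weights, b):
    """Hypothetical coding: each group of b bits becomes a weight in [-1, 1]."""
    top = 2 ** b - 1
    weights = []
    for k in range(n_weights):
        chunk = bits[k * b:(k + 1) * b]
        value = int("".join(map(str, chunk)), 2)
        weights.append(-1.0 + 2.0 * value / top)
    return weights

# Toy problem: N = 2 weights coded with B = 4 bits each, so L = N * B = 8.
N, B = 2, 4
target = [0.2, -0.6]
toy_cost = lambda bits: sum((w - t) ** 2 for w, t in zip(decode(bits, N, B), target))
best_bits, best_cost = reactive_tabu_search(toy_cost, N * B)
print(decode(best_bits, N, B), best_cost)
```

Because the best allowed move is always taken, even when it increases the cost, the trajectory can leave a local minimum, and the prohibition period prevents it from immediately falling back.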

3. Data selection and analysis 

At an energy of 14 TeV the dominant production mechanism of the Higgs in p-p collisions is
gluon-gluon fusion. For $M_H \simeq 200$ GeV the Higgs particle decays predominantly into a vector
gauge boson pair (ZZ, WW). Despite its smaller branching fraction
($\Gamma_{H \to WW}/\Gamma_{H \to ZZ} \simeq 3$), the so-called gold-plated channel

$pp \to HX \to Z^0 Z^0 X \to \mu^+\mu^-\mu^+\mu^- X$   (1)

provides a cleaner signal, with a narrow four-lepton invariant mass peak that for $M_H > 400$ GeV
would be clearly distinguishable from the ZZ continuum. As pointed out in several papers, this
channel can be exploited in the wide mass range 130 GeV $< M_H <$ 800 GeV (with one Z being virtual
for $M_H < 180$ GeV). For $M_H$ between 180 GeV and 400 GeV this channel is sensitive even at
integrated luminosities as low as $\sim 10^4$ pb$^{-1}$. We therefore take (1) as the signal in our
simulation, assuming a mass value of 200 GeV. In this case the main sources of background are
$t\bar{t}$ production:

$pp \to t\bar{t}X \to \mu^+\mu^-\mu^+\mu^- X$ ,   (2)

with the four muons arising from semileptonic decays of the top and antitop, and $Zb\bar{b}$
production:

$pp \to Z^0 b\bar{b} \to \mu^+\mu^-\mu^+\mu^- X$ ,   (3)

with one muon pair arising from the $Z^0$ decay and the other from semileptonic $b$ and $\bar{b}$
decays. The cross sections for the three processes, as calculated by the PYTHIA 5.7-JETSET 7.4 Monte
Carlo code used to generate the data, are:

$\sigma(pp \to HX \to ZZ \to 4\mu X') = 2.7 \cdot 10^{-3}$ pb   (4)

$\sigma(pp \to t\bar{t}X \to 4\mu X) = 7.1$ pb   (5)

$\sigma(pp \to Z^0 b\bar{b}X \to 4\mu X') = 5.7$ pb.   (6)
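
As a rough cross-check of these numbers, the following Python sketch computes the raw signal-to-background ratio implied by cross sections (4)-(6) and the corresponding event yields at an integrated luminosity of $10^4$ pb$^{-1}$, the value quoted earlier in this section. It ignores acceptance, cuts and detector efficiency; the dictionary keys and function names are illustrative only.

```python
# Cross sections (4)-(6) in pb, as quoted in the text.
SIGMA_PB = {
    "H -> ZZ -> 4mu": 2.7e-3,
    "t tbar -> 4mu": 7.1,
    "Z b bbar -> 4mu": 5.7,
}

def expected_events(sigma_pb, lumi_inv_pb):
    """Number of events = cross section times integrated luminosity."""
    return sigma_pb * lumi_inv_pb

def signal_to_background(sigmas, signal_key):
    """Ratio of the signal cross section to the summed background cross sections."""
    signal = sigmas[signal_key]
    background = sum(v for k, v in sigmas.items() if k != signal_key)
    return signal / background

LUMI = 1.0e4  # integrated luminosity in pb^-1
for name, sigma in SIGMA_PB.items():
    print(f"{name}: {expected_events(sigma, LUMI):.0f} events")
print(f"raw S/B = {signal_to_background(SIGMA_PB, 'H -> ZZ -> 4mu'):.2e}")
```

Before any selection the signal is thus outnumbered by roughly four orders of magnitude, which sets the scale of the discrimination task.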

The signature of channel (1) is characterized by two muon pairs with large transverse momentum and
invariant mass close to $M_{Z^0}$. In addition, the production of hadrons is expected to differ
noticeably between the signal and the background channels, owing to the more copious generation of
hadrons by hard parton scattering in processes (2) and (3) as compared to (1). However, this feature
remains hidden because of the huge number (typically several hundred at LHC energy) of hadrons
produced by the hadronization of the two remnant partons and by multiple interactions per beam
crossing. To remedy this, we choose to preprocess the data with the so-called $k_\perp$ clustering
algorithm. This algorithm consists of two steps. In the first, one compares

$d_{ij} = \min\{E_{Ti}^2, E_{Tj}^2\}\,[(\eta_i - \eta_j)^2 + (\phi_i - \phi_j)^2]$

with

$d_{iB} = E_{Ti}^2$ ,

where $E_{Ti}$ is the transverse energy of the $i$-th particle with respect to the beam direction,
$\eta_i$ is its pseudorapidity and $\phi_i$ is its azimuthal angle around the beam axis: a
final-state particle $i$ is attributed to the beam remnants (beam jet) if $d_{iB}$ is smaller than
$d_{ij}$, otherwise it is attributed to a hard jet. In the second step, which is not of interest
here, the particles belonging to hard jets are divided into different clusters.
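
A simplified, single-pass version of the first step can be sketched in Python as follows. It assumes the standard $k_\perp$ distance measures, $d_{ij} = \min(E_{Ti}^2, E_{Tj}^2)[(\Delta\eta)^2 + (\Delta\phi)^2]$ and $d_{iB} = E_{Ti}^2$, and it compares $d_{iB}$ with the smallest $d_{ij}$ for each particle without the iterative merging of the full algorithm. The function name and the (E_T, eta, phi) tuple layout are hypothetical.

```python
import math

def kt_beam_or_hard(particles):
    """Label each particle 'beam' or 'hard'.

    particles: list of (E_T, eta, phi) tuples, phi in radians.
    A particle is assigned to the beam jet if d_iB is smaller than its
    smallest pairwise distance d_ij (single pass, no merging).
    """
    labels = []
    for i, (et_i, eta_i, phi_i) in enumerate(particles):
        d_ib = et_i ** 2
        d_min = math.inf
        for j, (et_j, eta_j, phi_j) in enumerate(particles):
            if i == j:
                continue
            # Wrap the azimuthal difference into [0, pi].
            dphi = abs(phi_i - phi_j) % (2.0 * math.pi)
            if dphi > math.pi:
                dphi = 2.0 * math.pi - dphi
            d_ij = min(et_i, et_j) ** 2 * ((eta_i - eta_j) ** 2 + dphi ** 2)
            d_min = min(d_min, d_ij)
        labels.append("beam" if d_ib < d_min else "hard")
    return labels

# Two nearby energetic particles and one soft, isolated, forward particle.
event = [(50.0, 0.0, 0.0), (40.0, 0.1, 0.1), (1.0, 3.0, 2.0)]
print(kt_beam_or_hard(event))
```

Counting the particles labelled "hard" then gives a hadron multiplicity from which most of the soft beam-remnant activity has been removed.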

3.1. Off-line

Of course, the ability of a neural network to discriminate signal from background events lies in
the choice of the physical variables. In this off-line analysis we use the following ten physical
observables:

$X_1$-$X_4$  The transverse momenta of the four muons.

$X_5$-$X_8$  The invariant masses of the four different pairs.

$X_9$  The invariant mass of the four muons.

$X_{10}$  The hadron multiplicity related to hard jets obtained with the $k_\perp$ algorithm.

3.1.1. Training and testing 

The neural network we have configured on the neurochip Totem for the present work has a 10-20-1
feed-forward architecture. It has been trained using 4000 Higgs events, mixed with 2000 $t\bar{t}$
and 2000 $Zb\bar{b}$ events. It has then been tested on a set of data completely distinct from the
training set, composed according to the ratio of the cross sections of the three processes (1), (2)
and (3).

The performance of the ANN has been evaluated by introducing the two usual variables, purity ($P$)
and Higgs discrimination efficiency ($\epsilon$), defined as follows:

$P = \frac{N_H^a}{N_H^a + N_B^a}, \qquad \epsilon = \frac{N_H^a}{N_H}$   (7)

where $N_H$ is the total number of Higgs events in the testing sample, $N_H^a$ is the total number
of accepted (i.e. correctly identified) Higgs events and $N_B^a$ is the total number of accepted
background events, i.e. events that are incorrectly identified as Higgs events. One can make a
purity vs. efficiency plot by introducing a threshold parameter $l$ in the dynamical range [0, 1] of
the ANN output, so that if the ANN output $y$ for an event in the testing phase falls in the
subinterval $I_1 = [0, l]$, the event is classified as signal, whereas if it falls in the
subinterval $I_2 = ]l, 1]$, the event is classified as background. Our results are reported in
figure 1, where they are compared with those obtained using a simulated neural network trained by a
classical backpropagation algorithm for the same input variables [16].

Figure 1: The purity versus the Higgs efficiency.
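
The threshold scan behind such a purity versus efficiency curve can be sketched in Python as follows. The sketch assumes the usual definitions, purity = accepted Higgs / all accepted events and efficiency = accepted Higgs / all Higgs events, and follows the convention stated in the text that an output in $[0, l]$ is classified as signal. The function name and the toy outputs are illustrative and are not results from the paper.

```python
def purity_efficiency(outputs, is_signal, threshold):
    """Return (purity, efficiency) when events with output <= threshold are accepted."""
    n_higgs = sum(1 for s in is_signal if s)
    acc_higgs = sum(1 for y, s in zip(outputs, is_signal) if s and y <= threshold)
    acc_bkg = sum(1 for y, s in zip(outputs, is_signal) if not s and y <= threshold)
    accepted = acc_higgs + acc_bkg
    purity = acc_higgs / accepted if accepted else float("nan")
    efficiency = acc_higgs / n_higgs if n_higgs else float("nan")
    return purity, efficiency

# Toy network outputs: the first three events are signal, the last three background.
outputs = [0.1, 0.2, 0.8, 0.3, 0.9, 0.7]
truth = [True, True, True, False, False, False]
for l in (0.25, 0.5, 0.85):
    p, e = purity_efficiency(outputs, truth, l)
    print(f"l = {l:.2f}: purity = {p:.2f}, efficiency = {e:.2f}")
```

Raising the threshold accepts more Higgs events but also more background, which is the trade-off traced out by the curve.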

3.2. On-line 

As already pointed out, a main benefit of the hardware implementation of neural networks is speed:
this is true for the neurochip Totem, while the faster Totem++ is on the way to meeting the
requirements for use at the future LHC. With this in mind, we examine the case in which only the
transverse momenta of the final-state muons are known, with an uncertainty
$\delta P_t \simeq \pm 0.15\,P_t$, just as with the CMS muon spectrometer [11]. Moreover, we apply
two cuts to our four-muon events, namely:



$|P_t| > 5$ GeV and $|\eta| < 2.4$   (8)



where $\eta$ is the pseudorapidity of the muons. As input variables of our neural network we choose
the following:

$X_1$-$X_4$  The transverse momenta of the four muons.

$X_5$-$X_8$  The transverse masses of the four different pairs.

$X_9$  The transverse mass of the four-muon system.

$X_{10}$-$X_{11}$  The eigenvalues of the transverse Parisi momentum tensor.

The transverse mass of a set of $n$ particles with transverse momenta $\vec{P}_t^{(i)}$ is given by

$M_t^2 = \left(\sum_{i=1}^{n} |\vec{P}_t^{(i)}|\right)^2 - \left(\sum_{i=1}^{n} \vec{P}_t^{(i)}\right)^2$   (9)

while the transverse Parisi momentum tensor $\Pi_{ij}$ is defined as

$\Pi_{ij} = \sum_{k=1}^{4} P_i^{(k)} P_j^{(k)}, \qquad i, j = 1, 2$   (10)

where $P_i^{(k)}$ are the transverse components of the momentum of the $k$-th muon in the lab frame.
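
These two definitions can be turned into input variables with a short Python sketch. It assumes that the transverse mass is $M_t^2 = (\sum_i |\vec{P}_t^{(i)}|)^2 - (\sum_i \vec{P}_t^{(i)})^2$ and that the tensor is the unnormalized sum $\sum_k P_i^{(k)} P_j^{(k)}$ over the transverse components; any normalization used in the original analysis is not known here. The function names and the (px, py) tuple layout are hypothetical.

```python
import math

def transverse_mass(pts):
    """Transverse mass of a set of particles given as (px, py) tuples."""
    scalar_sum = sum(math.hypot(px, py) for px, py in pts)
    sum_x = sum(px for px, _ in pts)
    sum_y = sum(py for _, py in pts)
    m2 = scalar_sum ** 2 - (sum_x ** 2 + sum_y ** 2)
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding errors

def parisi_eigenvalues(pts):
    """Eigenvalues (largest first) of the 2x2 tensor sum_k P_i^(k) P_j^(k)."""
    a = sum(px * px for px, _ in pts)
    b = sum(px * py for px, py in pts)
    d = sum(py * py for _, py in pts)
    mean = 0.5 * (a + d)
    delta = math.hypot(0.5 * (a - d), b)  # closed form for a symmetric 2x2 matrix
    return mean + delta, mean - delta

# Toy four-muon event: transverse momentum components in GeV.
muons = [(30.0, 5.0), (-25.0, 10.0), (8.0, -20.0), (-6.0, -12.0)]
print(transverse_mass(muons), parisi_eigenvalues(muons))
```

Two comparable eigenvalues indicate a roughly isotropic event in the transverse plane, while one dominant eigenvalue indicates muons aligned along a single axis.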

3.2.1. Training and testing 

For the present case, as for the previous off-line one, we have implemented a feed-forward 11-32-1
neural network architecture on the neurochip Totem and followed exactly the same procedure, apart
from the introduction of the cuts (8). The (preliminary) results are shown in figure 1 together with
the off-line ones.

4. Conclusions 

We have shown that neural networks as implemented on the chip Totem deliver high-quality,
high-speed performance, probably not attainable by traditional statistical methods. They should
therefore be seriously considered and thoroughly investigated for effective use in physics
experiments. We stress that we gain a factor of about $10^3$ to $10^4$ in the signal-to-background
ratio.

We wish to thank G. Nardulli, G. Marchesini 
and G. Busetto for useful discussions. 